Applying these identities, we can write
Sdj = MIN(RowSelectA, RowSelectB) * Xprod

= 1/102 * 80,000

= 784 rows.

Moreover, note that the above equation is equivalent to choosing a maximum single-column current
UEC such that the applied equation can be written in an alternative form:
Sdj = 1/MAX(CurUecA, CurUecB) * Xprod

= 1/102 * 80,000

= 784 rows
where Sdj is the selectivity as defined above. 
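
The equivalence of the two forms can be checked numerically. The following is a minimal sketch, assuming the selectivities passed to MIN are simply the reciprocals of the current UECs. The function names and the value 40 for the second column's UEC are hypothetical; only the maximum UEC of 102 and the cross product of 80,000 come from the example above.

```python
def selectivity_min_form(sel_a: float, sel_b: float, xprod: int) -> float:
    """Sdj = MIN(selA, selB) * Xprod."""
    return min(sel_a, sel_b) * xprod


def selectivity_max_uec_form(cur_uec_a: int, cur_uec_b: int, xprod: int) -> float:
    """Sdj = 1/MAX(CurUecA, CurUecB) * Xprod."""
    return xprod / max(cur_uec_a, cur_uec_b)


XPROD = 80_000    # cross product, from the example above
CUR_UEC_A = 102   # maximum single-column current UEC, from the example above
CUR_UEC_B = 40    # hypothetical smaller UEC for the other column

# Assumption: each selectivity is the reciprocal of the column's current UEC.
rows_min = selectivity_min_form(1 / CUR_UEC_A, 1 / CUR_UEC_B, XPROD)
rows_max = selectivity_max_uec_form(CUR_UEC_A, CUR_UEC_B, XPROD)

print(round(rows_min), round(rows_max))  # prints: 784 784
```

Because MIN(1/a, 1/b) equals 1/MAX(a, b) for positive a and b, the two forms agree whenever the selectivities are reciprocals of the UECs, whatever the smaller UEC happens to be.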

It has been found that, where skew and possible row and UEC reduction can be ignored, this estimate
of selectivity is much better than one derived assuming complete independence.
Where such conditions are met, the estimated selectivity of 784 rows is a marked improvement over the
dramatic underestimate of 65 rows obtained using the prior art method.


A clean version of the paragraph starting at page j^line 13 should read as follows:

In order to obtain multi-column histogram information, we apply the following formula to each interval shown above:

(XprodT1.A) * (XprodT2.A) / MAX((CurUecT1.A), (CurUecT2.A)) /
(XprodT1.A + XprodT2.A).

These calculations generate the following joined histogram: 

Column T1.A, T2.A

Interval   CurUec   Rows     Value
0          0        0        0
1          1        10,000   25
2          100      200      150


Here we can also calculate row selectivity and UEC selectivity in a similar manner as before: 

RowselA = 10,200/80,000 = 0.1275
UecselA = 1/102 = 0.0098.

Comparing the results, we note that the join produces approximately 13 times as many rows as the total
UEC selectivity would predict (i.e., RowselA/UecselA = 13.005). It is this type of skew that the join skew
formula corrects when applying multi-column UEC information. If we applied the multi-column
formula without correcting for skew, we would lose all join skew information.
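
The arithmetic behind this comparison can be reproduced as follows. This is a minimal sketch: the interval row counts, the cross product of 80,000 and the UEC of 102 are taken from the text, while the variable names are illustrative only.

```python
# Rows per interval of the joined histogram (intervals 0, 1, 2), as listed in the text.
interval_rows = [0, 10_000, 200]

XPROD = 80_000   # cross product used in the text
UEC = 102        # UEC used in the text for UecselA

rowsel_a = sum(interval_rows) / XPROD   # 10,200 / 80,000
uecsel_a = 1 / UEC                      # 1 / 102
skew_ratio = rowsel_a / uecsel_a        # factor by which row selectivity exceeds UEC selectivity

print(f"RowselA = {rowsel_a:.4f}")
print(f"UecselA = {uecsel_a:.4f}")
print(f"ratio   = {skew_ratio:.3f}")
```

A ratio well above 1 indicates that the rows are concentrated in a few values, which is the skew that a UEC-only estimate would miss.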